(57) Abstract

An asynchronously pipelined SDRAM has separate pipeline stages that are controlled by asynchronous signals. Rather than using 
a clock signal to synchronize data at each stage, an asynchronous signal is used to latch data at every stage. The asynchronous control 
signals are generated within the chip and are optimized to the different latency stages. Longer latency stages require larger delay elements, 
while shorter latency stages require shorter delay elements. The data is synchronized to the clock at the end of the read data path before 
being read out of the chip. Because the data has been latched at each pipeline stage, it suffers from less skew than would be seen in a 
conventional wave pipeline architecture. Furthermore, since the stages are independent of the system clock, the read data path can be run 
at any CAS latency as long as the re-synchronizing output is built to support it.

SEMICONDUCTOR MEMORY ASYNCHRONOUS PIPELINE

The present invention relates to semiconductor memories and, more particularly, to 
a pipelined data access in a dynamic random access memory. 

BACKGROUND OF THE INVENTION 

In conventional non-pipelined dynamic random access memories (DRAMs), a data
transfer to and from the memory is performed in sequence. That is, when a read or a write
command is received and an address is made available, the data transfer according to
either a read or write command is performed in its entirety before another command is
accepted by the memory. This results in subsequent commands being delayed by the time
it takes for the current data transfer to complete.

Historically, DRAMs have been controlled asynchronously by the processor. This
means that the processor puts addresses on the DRAM inputs and strobes them in using
the row address select signal (RAS) and column address select signal (CAS) pins. The
addresses are held for a required minimum length of time. During this time, the DRAM
accesses the addressed locations in memory and, after a maximum delay (access time),
either writes new data from the processor into its memory or provides data from the
memory to its outputs for the processor to read.

During this time, the processor must wait for the DRAM to perform various
internal functions such as precharging of the lines, decoding the addresses and the like.
This creates a "wait state" during which the higher speed processor is waiting for the
DRAM to respond, thereby slowing down the entire system.

One solution to this problem is to make the memory circuit synchronous, that is,
add input and output latches on the DRAM which can hold the data. Input latches can
store the addresses, data, and control signals on the inputs of the DRAM, freeing the
processor for other tasks. After a preset number of clock cycles, the data can be available
on the output latches of a DRAM with synchronous control for a read or be written into its
memory for a write operation.

Synchronous control means that the DRAM latches information transferred
between the processor and itself under the control of the system clock. Thus, an advantage
of the synchronous DRAMs is that the system clock is the only timing edge that must be
provided to the memory. This reduces or eliminates propagating multiple timing strobes
around the printed circuit board.

Alternatively, the DRAM may be made asynchronous. For example, suppose a
DRAM with a 60 ns delay from row addressing to data access is being used in a system
with a 10 ns clock. The processor must then apply the row address and hold it active while
strobing it in with the RAS pin. This is followed 30 ns later by the column address,
which must be held valid and strobed in with the CAS pin. The processor must then
wait for the data to appear on the outputs 30 ns later, stabilize, and be read.

On the other hand, for a synchronous interface, the processor can lock the row and
column addresses (and control signals) into the input latches and do other tasks while
waiting for the DRAM to perform the read operation under the control of the system
clock. When the outputs of the DRAM are clocked six cycles (60 ns) later, the desired data
is in the output latches.
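
The six-cycle figure in this example follows from dividing the access delay by the clock period. A minimal sketch of that arithmetic is given here; the function name and the choice to round up to a whole cycle are illustrative assumptions, and only the 60 ns delay and 10 ns clock come from the example.

```python
import math

def cycles_until_data(access_delay_ns: float, clock_period_ns: float) -> int:
    """Whole clock cycles needed to cover the access delay.

    Rounding up to the next full cycle is an assumption made for this sketch.
    """
    return math.ceil(access_delay_ns / clock_period_ns)

# Values from the example: 60 ns row-address-to-data delay, 10 ns clock.
print(cycles_until_data(60, 10))
```

With a delay that is not an exact multiple of the clock period, the same function would report the next whole cycle.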

A synchronous DRAM architecture also makes it possible to speed up the average
access time of the DRAM by pipelining the addresses. In this case, it is possible to use the
input latch to store the next address from the processor while the DRAM is operating on
the previous address. Normally, the addresses to be accessed are known several cycles in
advance by the processor. Therefore, the processor can send the second address to the
input address latch of the DRAM to be available as soon as the first address has moved on
to the next stage of processing in the DRAM. This eliminates the need for the processor to
wait a full access cycle before starting the next access to the DRAM.

An example of a three stage column address pipeline is shown in the schematic
diagram of figure 1(a). The column address-to-output part is a three stage pipeline. The
address buffer is the first latch. The column switch is the second latch and the output
buffer is the third latch. The latency inherent in the column access time is therefore
divided up between these three stages.

The operation of a pipelined read may be explained as follows: the column address
(A1) is clocked into the address buffer on one clock cycle and is decoded. On the second
clock cycle, the column switch transfers the corresponding data (D1) from the sense
amplifier to the read bus and column address (A2) is clocked into the address buffer. On
clock three, the data (D1) is clocked into the output buffer, (D2) is transferred to the read
bus and A3 is clocked into the column address buffer. When D1 appears at the output, D2
and D3 are in the pipeline behind it. For a more detailed discussion of the present
technology, the reader is referred to a book entitled "High Performance Memories" by
Betty Prince.
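
The cycle-by-cycle behaviour just described can be traced with a small simulation. This is only a sketch of the three-latch sequence (address buffer, column switch/read bus, output buffer); the function names and the string labels for addresses and data are illustrative assumptions, and no circuit timing is modelled.

```python
def run_pipeline(addresses, data_for):
    """Step a three-latch column pipeline one clock cycle at a time.

    Returns a list of (cycle, address_buffer, read_bus, output_buffer).
    """
    address_buffer = read_bus = output_buffer = None
    pending = list(addresses)
    trace = []
    cycle = 0
    while pending or address_buffer is not None or read_bus is not None:
        cycle += 1
        # On each clock edge every latch captures what the previous stage held.
        output_buffer = read_bus
        read_bus = data_for(address_buffer) if address_buffer is not None else None
        address_buffer = pending.pop(0) if pending else None
        trace.append((cycle, address_buffer, read_bus, output_buffer))
    return trace

# Hypothetical mapping: address "A1" selects data "D1", and so on.
for row in run_pipeline(["A1", "A2", "A3"], lambda a: "D" + a[1:]):
    print(row)
```

On the third cycle of this trace, A3 is in the address buffer, D2 is on the read bus and D1 is in the output buffer, matching the sequence in the text.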

The delay, in number of clock cycles, between the latching of CAS in an SDRAM
and the availability of data on the data bus is the "CAS latency" of the SDRAM. If the output
data is available by the second leading edge of the clock following arrival of a column
address, the device is described as having a CAS latency of two. Similarly, if the data is
available at the third leading edge of the clock following the arrival of the first read
command, the device is known as having a "CAS latency" of three.

Synchronous DRAMs (SDRAMs) come with programmable CAS latencies. As
described above, the CAS latency determines at which clock edge data will be
available after a read command is initiated, regardless of the clock rate (CLK). The
programmable CAS latencies enable SDRAMs to be efficiently utilized in different
memory systems having different system clock frequencies without affecting the CAS
latency.
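
Because CAS latency is counted in clock edges, the absolute time at which data becomes available scales with the clock period. The sketch below illustrates this, together with one plausible way of picking a latency for a given clock. The function names, the set of supported latencies and the 30 ns access time are assumptions made for illustration and do not come from the text.

```python
def data_valid_time_ns(cas_latency: int, clock_period_ns: float) -> float:
    """Time from the read command to valid data, for a given CAS latency."""
    return cas_latency * clock_period_ns

def smallest_usable_latency(access_ns, clock_period_ns, supported=(2, 3)):
    """Smallest supported CAS latency whose clock edge falls at or after the
    (hypothetical) internal access time; None if no supported value is enough."""
    for cas_latency in sorted(supported):
        if cas_latency * clock_period_ns >= access_ns:
            return cas_latency
    return None

# The same CAS latency of three gives different absolute delays at different clocks.
print(data_valid_time_ns(3, 10), data_valid_time_ns(3, 20))
# For an assumed 30 ns access time, a slower clock permits a lower latency.
print(smallest_usable_latency(30, 10), smallest_usable_latency(30, 15))
```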

There are other ways to divide an SDRAM data path into latency stages. A wave 
pipeline is shown schematically in figure 1(b). A regular clocked pipeline has the 
disadvantage that the read latency will be equal to the delay of the slowest pipeline stage 
(i.e. longest delay) multiplied by the number of pipeline stages. A clocked pipeline with 
adjusted clocks uses clock signals that have been adjusted to each pipeline stage so that 
longer pipeline stages may be accommodated without impacting the read latency. A 
longer pipeline stage will be ended with a clock that is more delayed than the clock that 
starts the pipeline stage. A shorter pipeline stage will be started with a clock that is more 
delayed than the clock that ends the pipeline stage. A disadvantage of this scheme is that 
different adjustments to the clock are needed for each CAS latency supported by the chip. 
Also, architecture changes can have a large impact on the breakdown of the latency stages, 
requiring designers to readjust all the clocks to accommodate the new division of latency 
stages. 
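
The latency penalty of a regular clocked pipeline can be made concrete with a short calculation. In this sketch the stage delays are hypothetical numbers, and the sum of the stage delays is used as the figure an adjusted-clock scheme aims to approach; both are assumptions for illustration.

```python
def clocked_latency(stage_delays_ns):
    """Read latency of a regular clocked pipeline: every stage is given the
    time of the slowest stage, so latency = slowest delay x number of stages."""
    return max(stage_delays_ns) * len(stage_delays_ns)

def summed_stage_delay(stage_delays_ns):
    """Latency if each stage took only its own propagation delay."""
    return sum(stage_delays_ns)

stages = [4, 10, 6]  # hypothetical stage delays in ns
print(clocked_latency(stages))     # slowest stage dominates
print(summed_stage_delay(stages))  # what per-stage timing could approach
```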

Furthermore there are a limited number of places where a latency stage can be 
inserted without adding extra latency or chip area. Multiple latency stages have a 
disadvantage in that not all latency stages will be equal in the time needed for signals to 
propagate through the stage. Another complication is the need to enable or disable latency 
stages depending on the CAS latency at which the chip has been programmed to operate. 

The wave pipeline of figure 1(b) runs pulses of data through the entire read data
path. A wave pipeline relies on an ideal data path length, that is, it assumes that all data
paths are equal. However, data retrieved from certain memory cells in a memory array
will be inherently faster than data retrieved from other memory cells. This is primarily due
to the physical location of the memory cells relative to both the read in and read out data
path. Thus data must be resynchronized before being output from the chip. This data path
skew makes it difficult to safely resynchronize the retrieved data in a wave pipeline
implementation.

If address signals are applied to a data path with a cycle time which exceeds the
memory access time, then the data which is read from the memory is not output during the
inherent delay of the memory core. In other words, in the wave pipeline technique address
input signals are applied with a period which is less than the critical path of the memory
core section.

Furthermore as illustrated in figures 2(a) and 2(b) with a slow clock it is necessary 
to store the output data of the wave pipeline until the data is needed. 

SUMMARY OF THE INVENTION 
The present invention thus seeks to mitigate at least some of the various
disadvantages described with respect to the current art.

In accordance with this invention there is provided a pipelined SDRAM comprising:

(a) a memory core;

(b) a read path, defined between an address input port and an I/O data output port;

(c) a plurality of pipeline stages located in said read path, each controlled by a
corresponding one of a plurality of asynchronous control signals;

(d) a timing delay element for generating said asynchronous control signals;

(e) latches associated with each of said plurality of pipeline stages responsive to
said asynchronous control signal to latch data at each of said stages; whereby
data is latched at every pipeline stage independent of a system clock.

In accordance with a further aspect of this invention the asynchronous control
signals are generated within the chip and optimized to the different latency stages.

A still further aspect of the invention provides stages that are independent of the
system clock, thereby allowing the read data path to be run at any CAS latency which may
be supported by a suitable resynchronizing output.

A still further aspect of the invention provides for a synchronization circuit 
coupled to the end of the read data path for synchronizing the output data to a system 
clock.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the invention will be obtained by reference to the
detailed description below in conjunction with the following drawings in which:

Figure 1(a) is a schematic diagram of a conventional clocked pipeline memory
circuit;

Figure 1(b) is a schematic diagram of a conventional wave pipeline memory
circuit;

Figures 2(a) and 2(b) are timing waveforms for a SDRAM having a CAS latency
of 3 running under fast and slow clock conditions respectively;

Figure 3 is a schematic diagram of a generalized embodiment of the present
invention;

Figure 4 is a more detailed schematic diagram of the generalized embodiment of
figure 3;

Figure 5 is a timing waveform diagram according to a first embodiment of the
present invention;

Figures 6(a), 6(b) and 6(c) show detailed circuit diagrams of a pipe control circuit
according to an embodiment of the present invention;

Figures 7(a), 7(b) and 7(c) show detailed circuit diagrams for a pipe latch and data
output latch according to an embodiment of the present invention; and

Figure 8 is a schematic diagram of a data output control circuit according to an
embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 
In the following discussion, like numerals refer to like elements in the figures and
signals asserted low are indicated interchangeably with an x or an overscore associated
with the corresponding signal name. Referring now to Figure 3, a schematic diagram of a
pipelined semiconductor memory according to a generalized embodiment of the invention
is shown generally by numeral 20. The memory includes a core memory array 22 having
a read path 24, defined between an address input port 25 and a data output 32. The read
path 24 is broken up into latency stages 27, each of which is latched by respective
asynchronous control signals 28. A synchronizing circuit 30 is coupled to the last latency
stage of the read path in order to resynchronize the data to the system clock CLK at output



WO 99/50852 PCT/CA99/00282 

32 of the read path. The data is synchronized to the system clock CLK a predetermined
number of clock cycles after the application of an address signal A to the address input 25,
i.e., depending on the CAS latency of the system. The segmentation of the read path 24
into the three main latency stages, each controlled by respective control signals 28,
illustrates, in general, the combining of clocked and wave pipeline techniques to achieve
an asynchronous pipeline implementation according to the invention which exhibits less
skew than a conventional wave pipeline but which allows for operation with any CAS
latency without having to adjust individual clocks in each stage as would be required in a
clocked pipeline implementation. The description with respect to figure 3 serves to
provide an overview of the detailed circuits discussed below.

Thus, referring to figure 4, a detailed schematic diagram of the generalized
embodiment of figure 3 is shown by numeral 40. The latency stages 26 in figure 3 include
an address input latch 42, for receiving an address signal A* at the address input port 25,
the output of which is coupled to an address pre-decoder latch 44 which is in turn
connected to a column address decoder latch 46. The column address decoder latch 46
decodes the address signal and is coupled to select memory cells 48 in the memory cell
array 22. The column address decoder 46 activates relevant sense amplifiers (not shown)
for detecting the data read out from a selected memory cell 48. The output of the sense
amplifiers is coupled to a read main amplifier block 50 via a local databus DB, which is
32 bits wide in this embodiment. The output of the read main amplifier 50 is coupled to a
global databus GDB. A multiplexer 52 multiplexes the GDB onto an I/O databus IODB,
which is in turn coupled to a read databus amplifier RDBAMP latch block 54.

The synchronizing circuit 30 of figure 3 is comprised of pipe latches 56, an output
buffer 58 and control circuitry shown by block 61. More specifically, the output from the
RDBAMP latch is selectively coupled to the input of three pipe latches pipe_latch0,
pipe_latch1 and pipe_latch2 as will be described below. The outputs from the pipe latches
are connected together and coupled to the input of the output buffer 58.

The memory also includes a command latch circuit 62 having a clock input
terminal coupled to the system clock CLK and a command input terminal for receiving
command signals such as RAS, CAS and CS. The command latch 62 provides a first
control signal 64, which is run through a series of control logic and delay elements T1 to
T4. Each of the delay elements T1, T2, T3 and T4 produces a respective delayed control
signal that is fed to an input latch terminal of the pre-decoder latch 44, the Y decoder
46, the RMA 50 and the RDBAMP latch 54, respectively. These signals serve as
individual asynchronous control signals for these circuits. On the other hand, the address
latch clock input is derived directly from the system clock signal CLK.

Control of the pipe latches pipe_latch0, pipe_latch1 and pipe_latch2 is provided by
the pipe latch control circuitry 61. Each pipe latch is driven by a respective pipe latch
enable signal, latch_enx(0), latch_enx(1) and latch_enx(2), coupled to its latch input enable
terminal. The pipe latch enable signals are derived from a pipe counter 64 which
produces three count signals COUNT. The pipe counter is a free running counter which
resets its count, based on the total number of pipe latches, after a preset number of clock
counts set by the system clock signal coupled to the pipe counter clock input terminal.
The output COUNT signals from the pipe counter are coupled via count delay elements 66
to count synchronization latches 68. The outputs from the three latches 68 provide the
pipe latch enable signals for clocking the appropriate pipe latch 56. The clock input enable
terminals of the latches 68 are coupled to the asynchronous control signal of the last latency
stage in the read path, in this case signal IODB_READX of the RDBAMP 54, to ensure
the pipe latch is latched at the appropriate time.

Alternatively, a more accurate synchronization of the data IODB_READX and the
CNT_DEL signals in latch 68 can be achieved as follows: the count delay circuitry 66
could be segmented into multiple delay stages, each receiving control logic enable
signals such as YSG or Y_EXTRD. The timing relationship between the address
propagation and data retrieval and the clock count delay would therefore be more
closely matched.

Additionally, the output COUNT of pipe counter 64 is connected to a pipe delay
element 70 for generating a pipe latch output enable signal QEN_RISEX which is
connected to the respective output enable terminal of the pipe latches 56. A CLK_IO
signal, which is DLL generated and slightly leads the system clock CLK, is coupled to an
output enable terminal of the pipe delay and the output buffer 58. The DLL (delay locked
loop) ensures that CLK_IO will enable the output buffer to properly synchronize data with
the system clock edge.

The operation of the circuit will be explained as follows with reference to the
timing diagram shown in Figure 5. At time t0 of the system clock signal CLK the address
latch 42 latches the external address signal Ai, which is then free to propagate to the
pre-decoder latch 44, which latches the address after a delay T1 set by the delay element T1.
These address signals are decoded in the Y decoder 46 and latched by the signal YSG
delayed from CLK by T1 and T2. At this time the appropriate columns are activated and
data is read out from the memory cells into column sense amplifiers and then latched in
the RMA 50 by the IOREAD signal, which is delayed from CLK by T1 + T2 + T3.
Shortly thereafter, the data is available on the global data bus GDB. The RDBAMP 54
may now be latched at time t1 by signal IODB_READ, which is delayed from IOREAD by
T4, to provide the DOUTE signal.

In general as described above, these asynchronous control signals are used to 
control the pipeline stages. These signals control when data is read into the latch (usually 
a latched amplifier). Once read into the latch, data is free to propagate toward the next 
stage. Each control signal is generated by delaying the control signal from the previous 
latency stage. The first stage is started by the external clock CLK. The next stage will 
latch data from the previous stage on the control signal that is delayed from the external 
clock. It may be noted that some of these delays are inherent in the circuits used to control 
whether a read is to take place, while some of the delays are deliberately added using 
timing delay elements. These are usually comprised of buffers sized to run slowly and 
which may include additional resistive or capacitive elements. 
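
The chaining of control signals described above, where each stage's latch signal is the previous stage's signal plus a delay, amounts to a running sum of the delays T1 to T4. The sketch below computes those firing times relative to the CLK edge. The numeric delay values are hypothetical and the function name is an assumption; only the ordering of the signals (pre-decoder latch, YSG, IOREAD, IODB_READ) follows the description.

```python
from itertools import accumulate

def control_signal_times(stage_delays_ns, clk_edge_ns=0):
    """Time at which each stage's asynchronous latch signal fires.

    Each signal is generated by delaying the previous one, so the firing
    times are the cumulative sums of the delays, offset by the CLK edge.
    """
    return [clk_edge_ns + t for t in accumulate(stage_delays_ns)]

# Hypothetical values for T1..T4 in ns.
delays = [2, 3, 4, 3]
signals = ["pre-decoder latch", "YSG", "IOREAD", "IODB_READ"]
for name, t in zip(signals, control_signal_times(delays)):
    print(f"{name}: {t} ns after CLK")
```

Changing one delay in this model shifts every later signal by the same amount, which mirrors how the stage timings depend on the internal delays and not on the external clock period.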

Thus the delays T1 to T4 can be optimized to the particular memory independent
of the external clock timing. The delay for each of these latches is selected to
accommodate the propagation delays between these blocks. Thus the clock signal applied
to the read main amplifier latch is synchronized and delayed from the clock signal applied
to the column decoder latch to accommodate the lag in retrieving data from different areas
of the memory array 22 to the read main amplifier 50.

The data, once latched in the RDBAMP 54 at time t1, must, as with the
conventional wave pipelines, be resynchronized to the system clock CLK at the output 32
of the memory. This is accomplished as follows. The pipe latches 56 allow data to be
stored in the event of fast data or a slow clock. Generally, the number of latches needed is
equivalent to the number of latency stages to be supported. Each time a read is performed,
a COUNT signal (one of which is shown in figure 5) is delayed asynchronously by the
count delay element 66 and clocked by the control signal for the last stage, in this case
IODB_READ, into a clock synchronizing latch 68. This time delayed COUNT signal
generates LATCH_EN, which determines which of the latches 56 the data from
RDBAMP 54 is to be stored in. Furthermore, the COUNT signal is also delayed by the
appropriate number of clock cycles, as determined by the current CAS latency to which
the chip is programmed. This clock delayed COUNT signal, shown as QEN_RISE in
figure 5, controls which of the latches 56 has its output enabled to output data to the output
buffer 58. Once COUNT has been set, after the delay through count delay circuitry 66, a
CNT_DEL signal is generated which is combined in the clock synchronizing latch 68 with
the IODB_READX signal to generate the LATCH_ENX signal. After the predetermined
clock delay in the pipe delay circuit, QEN_RISEX is asserted, allowing output from the
latch containing the data for the appropriate clock cycle. The latches 56 work as a FIFO
register, with the first data input to one of the set of latches 56 being the first data to be
output from the set of latches.
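
The FIFO behaviour of the pipe latches can be sketched as a small ring of storage slots with separate input and output selectors. This is a behavioural illustration only: the class and method names are assumptions, all analog timing is omitted, and the two counters merely stand in for the delayed COUNT that selects LATCH_EN and the clock-delayed COUNT that selects QEN_RISE.

```python
class PipeLatches:
    """Behavioural model of a set of parallel pipe latches used as a FIFO."""

    def __init__(self, num_latches=3):
        self.latches = [None] * num_latches
        self.in_count = 0   # stands in for the asynchronously delayed COUNT (LATCH_EN)
        self.out_count = 0  # stands in for the clock-delayed COUNT (QEN_RISE)

    def latch_in(self, data):
        """Store data arriving from the read path in the next latch in turn."""
        if self.in_count - self.out_count >= len(self.latches):
            raise OverflowError("all pipe latches hold data not yet output")
        self.latches[self.in_count % len(self.latches)] = data
        self.in_count += 1

    def drive_out(self):
        """Enable the output of the oldest latched data, as on an output clock edge."""
        if self.out_count == self.in_count:
            raise LookupError("no latched data waiting")
        data = self.latches[self.out_count % len(self.latches)]
        self.out_count += 1
        return data

pipe = PipeLatches()
pipe.latch_in("D1")
pipe.latch_in("D2")      # fast data or slow clock: two items wait in the latches
print(pipe.drive_out())  # first in, first out
```

The overflow check reflects the point in the text that enough latches are needed to hold data that arrives before the output clock edge that consumes it.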

Thus from the above description it may be seen that the latches in the read path
segment the path into latency stages of an asynchronous pipeline. The chip architecture
and the maximum operating frequency determine the number and placement of these
stages. In general, a higher operating frequency will require a larger number of shorter
pipeline stages. Other techniques can be used, such as doubling the number of data paths
in a stage and alternating between the data paths. For example, a read output from the
sense amplifiers is alternated between two data buses. This is described in Mosaid Patent
No. 5,416,743. The placement of the stages will generally be dictated by the position of
amplifiers or buffers, which may be converted into latches without resulting in extensive
area penalty. For clarity, in the previous and following discussion latency stages refer to
any circuit element capable of introducing a delay in the signal or data path.

Turning now to figures 6 to 8, a detailed implementation of the generalized
embodiment of figure 4 is shown. Accordingly, referring to figure 6a, the pipe control
circuitry 61 includes a pipe counter 90, a detailed schematic of which is shown in figure
6b, for producing a two digit binary count, COUNT 0 and COUNT 1, determined by the
input system clock frequency at its clock input terminal CLK. Each of the count lines,
COUNT 1 and COUNT 0, is connected to a respective count delay element 92 and 94.
The delayed count signals are connected to a count decoder 96 which decodes the input
binary count to activate one of the three count delay lines 98, CNT0_DEL, CNT1_DEL,
CNT2_DEL. The signals on these delayed count lines 98 correspond to the COUNT
signal as shown in figure 5. In figure 5, all elements were shown with only one of the
three components for simplicity with the exception of the three pipe latches. The delayed
COUNT signals 98 are coupled to the inputs of respective clocked latches 100, the outputs
of which are buffered and provide the respective latch enable signals referred to in figure 5,
LATCH_ENX(0), LATCH_ENX(1), LATCH_ENX(2). The clock input terminal of these
latches 100 is coupled to the IODB_READ asynchronous control signal from the last
latency stage via an inverter.
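
The decoding of the two-bit count into one of three count lines can be illustrated as a one-hot decode. The sketch assumes that the counter steps through the binary values 0, 1, 2 and then wraps, and that value n activates CNTn_DEL; the actual encoding in figure 6b is not reproduced here, and the function names are hypothetical.

```python
def next_count(count, num_latches=3):
    """Free-running count that wraps after one step per pipe latch (assumed sequence 0, 1, 2)."""
    return (count + 1) % num_latches

def decode_count(count1, count0):
    """Decode the two count bits into one-hot lines (CNT0_DEL, CNT1_DEL, CNT2_DEL)."""
    value = (count1 << 1) | count0
    if value > 2:
        raise ValueError("count value 3 is not used with three pipe latches")
    return tuple(int(value == line) for line in range(3))

count = 0
for _ in range(4):
    bits = (count >> 1) & 1, count & 1
    print(count, decode_count(*bits))
    count = next_count(count)
```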

The pipe counter 90 also has its output connected to a second decoder 102 also
providing respective count signals, CNT 0, CNT 1 and CNT 2, which are coupled to
respective pipe delay elements 104, 106 and 108. A detailed circuit diagram of the pipe
delay circuit implementation is shown in figure 6c. The output of the pipe delay is
controlled by a CLK_IO signal and generates the QEN_RISE signal referred to in figure 5
connected to the output latch enable of the pipe latches 56. Corresponding
QEN_FALL signals are generated for the falling edge of the system clock whereas
QEN_RISE corresponds to the rising edge of the system clock.

Referring to figures 7a and 7b, a detailed schematic of the pipe latches 56 and the
output buffer circuitry is shown. As may be seen in figure 7a, the data bits from the IODB
databus are received at the input of the RDB amplifiers 110. Two RDBAMPs are shown
in this implementation because of the double data rate (DDR), where data is clocked on
both the rise and fall edges of the system clock. The outputs from the RDBAMPs are
connected to a series of six pipe latches 112 to 122. Six latches are required instead of
three due to the DDR implementation. The enable inputs of the pipe latches 112 to 122
are coupled to the respective latch enable signals derived from the circuit of figure 6a.
The top three pipe latches 112 to 116 have their outputs connected to inputs of a 3 OR 2
NAND gate 124. Similarly, the bottom three latches 118 to 122 have their outputs
connected to a 3 OR 2 NAND gate 126. The QEN_RISE signal is connected to the
inputs of the 3 OR 2 NAND gate 124, the output of which, when enabled, couples data to
the DOUT_RISE, DOUT_RISEX input of the output buffer shown in figure 7b. As may
also be seen in figure 7a, a system clock control signal EDGE is provided for directing
data to the top three or bottom three latches, once again a DDR feature. Also, for a fast
system clock relative to the speed of the data path, the 3 OR 2 NAND gates 124 or 126 will
be already on, thus allowing data to pass through to the output buffer from the pipe latches.
In the alternative, with a slow clock, the system receives the data and waits for the clock,
thus utilizing the 3 OR 2 NAND gates 124 or 126. Turning back to figure 7b, the data
output buffer 58 as shown in figure 4 is comprised of data output latches 130 to 136. The
input enable terminals of the data output latches 130 to 136 are coupled to the CLK_IO
signal for synchronizing to the correct system clock edge. A detailed circuit
implementation of the pipe latches 112 to 122 is shown in figure 7c.

Thus, it may be seen that the present invention provides a flexible method for 
implementing a pipelined semiconductor memory, which can easily accommodate both a 
fast and slow system clock. Furthermore, the flexible design allows further segmentation 
of the read path for more precise matching of internal signals. Furthermore, various CAS 
latencies may be accommodated by simply delaying the output from the pipe delay 
element 70 to wait a specific number of clock cycles before clocking the data out. 

Although the invention has been described with reference to certain specific 
embodiments, various modifications thereof will be apparent to those skilled in the art 
without departing from the spirit and scope of the invention as outlined in the claims 
appended hereto. 

THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS: 

1. A pipelined synchronous dynamic random access memory comprising:

(a) a memory core having addressable memory elements; 

(b) a read path, defined between an address input port and an I/O data port; said 
memory core included in said read path and having one or more pipeline 
stages, each pipeline stage being controlled by a corresponding asynchronous 
control signal; 

(c) delay elements for generating said asynchronous control signals; 

(d) latches associated with each of said pipeline stages responsive to at least one of
said asynchronous control signals to latch data at each of said stages; whereby
data is latched at every pipeline stage independently of a system clock.

2. A memory as defined in claim 1, including a synchronization circuit coupled to 
said I/O port for synchronizing the output data to a system clock. 

3. A memory as defined in claim 2, said synchronization circuit including a plurality 
of pipe latches coupled in parallel, and each responsive to respective pipe control 
signals for sequentially inputting data into successive latches. 

4. A memory as defined in claim 3, said pipe control signals being generated by a 
pipe counter, said counter including pipe delay elements coupled to an output 
thereof. 

5. A memory as defined in claim 4, said pipe delay elements for generating a delay 
equivalent to a sum of said latency stage delays. 

6. A method for pipelining a synchronous dynamic random access memory; said 
method comprising the steps of: 

(a) defining a read path between an address input port and an I/O data port of a 
memory core having addressable memory elements, said path including one or 
more pipeline stages; 

(b) latching data from said I/O port in response to a system clock; 

(c) generating asynchronous control signals from a master control signal; and 

(d) controlling said pipeline stages with said asynchronous control signals whereby 
data latched in each said pipeline stage is timed independently of said system 
clock. 